34 research outputs found

    Qserv: a distributed shared-nothing database for the LSST catalog

    Get PDF
    The LSST project will provide public access to a database catalog that, in its final year, is estimated to include 26 billion stars and galaxies in dozens of trillion detections in multiple petabytes. Because we are not aware of an existing open-source database implementation that has been demonstrated to efficiently satisfy astronomers' spatial self-joining and cross-matching queries at this scale, we have implemented Qserv, a distributed shared-nothing SQL database query system. To speed development, Qserv relies on two successful open-source software packages: the MySQL RDBMS and the Xrootd distributed file system. We describe Qserv's design, architecture, and ability to scale to LSST's data requirements. We illustrate its potential with test results on a 150-node cluster using 55 billion rows and 30 terabytes of simulated data. These results demonstrate the soundness of Qserv's approach and the scale it achieves on today's hardware

    REPORT FROM THE 2nd WORKSHOP ON EXTREMELY LARGE DATABASES

    Get PDF
    在科学界和业界,大规模分析的复杂性已经在近些年有了很大的提升。分析人员正在努力尝试使用复杂的技术,比如时间序列分析和分类算法,因为他们平时所熟悉的工具,虽然功能强大,但是可扩展性较差,无法有效使用可扩展的数据库系统。第2届XLDB大会,主要目的在于了解这些存在的问题,剖析这些问题的背后原因,并寻找相应的解决方案。大会还讨论了建设一个新的开源科学数据库SciDB,这个构想是在第1届XLDB大会(XLDB2007)上提出来的。本文是本次大会活动和讨论的总结报告

    On the Verge of One Petabyte - the Story Behind the BaBar Database System

    Full text link
    The BaBar database has pioneered the use of a commercial ODBMS within the HEP community. The unique object-oriented architecture of Objectivity/DB has made it possible to manage over 700 terabytes of production data generated since May'99, making the BaBar database the world's largest known database. The ongoing development includes new features, addressing the ever-increasing luminosity of the detector as well as other changing physics requirements. Significant efforts are focused on reducing space requirements and operational costs. The paper discusses our experience with developing a large scale database system, emphasizing universal aspects which may be applied to any large scale system, independently of underlying technology used.Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 6 pages. PSN MOKT01

    Agile software development in an earned value world: a survival guide

    Get PDF
    Agile methodologies are current best practice in software development. They are favored for, among other reasons, preventing premature optimization by taking a somewhat short-term focus, and allowing frequent replans/reprioritizations of upcoming development work based on recent results and current backlog. At the same time, funding agencies prescribe earned value management accounting for large projects which, these days, inevitably include substantial software components. Earned Value approaches emphasize a more comprehensive and typically longer-range plan, and tend to characterize frequent replans and reprioritizations as indicative of problems. Here we describe the planning, execution and reporting framework used by the LSST Data Management team, that navigates these opposite tensions

    Designing a Multi-Petabyte Database for LSST

    Get PDF
    The 3.2 giga-pixel LSST camera will produce approximately half a petabyte of archive images every month. These data need to be reduced in under a minute to produce real-time transient alerts, and then added to the cumulative catalog for further analysis. The catalog is expected to grow about three hundred terabytes per year. The data volume, the real-time transient alerting requirements of the LSST, and its spatio-temporal aspects require innovative techniques to build an efficient data access system at reasonable cost. As currently envisioned, the system will rely on a database for catalogs and metadata. Several database systems are being evaluated to understand how they perform at these data rates, data volumes, and access patterns. This paper describes the LSST requirements, the challenges they impose, the data access philosophy, results to date from evaluating available database technologies against LSST requirements, and the proposed database architecture to meet the data challenges

    Agile software development in an earned value world: a survival guide

    Get PDF
    Agile methodologies are current best practice in software development. They are favored for, among other reasons, preventing premature optimization by taking a somewhat short-term focus, and allowing frequent replans/reprioritizations of upcoming development work based on recent results and current backlog. At the same time, funding agencies prescribe earned value management accounting for large projects which, these days, inevitably include substantial software components. Earned Value approaches emphasize a more comprehensive and typically longer-range plan, and tend to characterize frequent replans and reprioritizations as indicative of problems. Here we describe the planning, execution and reporting framework used by the LSST Data Management team, that navigates these opposite tensions

    LSST Science Book, Version 2.0

    Get PDF
    A survey that can cover the sky in optical bands over wide fields to faint magnitudes with a fast cadence will enable many of the exciting science opportunities of the next decade. The Large Synoptic Survey Telescope (LSST) will have an effective aperture of 6.7 meters and an imaging camera with field of view of 9.6 deg^2, and will be devoted to a ten-year imaging survey over 20,000 deg^2 south of +15 deg. Each pointing will be imaged 2000 times with fifteen second exposures in six broad bands from 0.35 to 1.1 microns, to a total point-source depth of r~27.5. The LSST Science Book describes the basic parameters of the LSST hardware, software, and observing plans. The book discusses educational and outreach opportunities, then goes on to describe a broad range of science that LSST will revolutionize: mapping the inner and outer Solar System, stellar populations in the Milky Way and nearby galaxies, the structure of the Milky Way disk and halo and other objects in the Local Volume, transient and variable objects both at low and high redshift, and the properties of normal and active galaxies at low and high redshift. It then turns to far-field cosmological topics, exploring properties of supernovae to z~1, strong and weak lensing, the large-scale distribution of galaxies and baryon oscillations, and how these different probes may be combined to constrain cosmological models and the physics of dark energy.Comment: 596 pages. Also available at full resolution at http://www.lsst.org/lsst/sciboo
    corecore